large-scale prediction
ScoreFormer: A Surrogate Model For Large-Scale Prediction of Docking Scores
Ciudad, Álvaro | Morales-Pastor, Adrián | Malo, Laura | Filella-Mercè, Isaac | Guallar, Victor | Molina, Alexis
In this study, we present ScoreFormer, a novel graph transformer model designed to accurately predict molecular docking scores, thereby optimizing high-throughput virtual screening (HTVS) in drug discovery. The architecture integrates Principal Neighborhood Aggregation (PNA) and Learnable Random Walk Positional Encodings (LRWPE), enhancing the model's ability to understand complex molecular structures and their relationship with their respective docking scores. This approach significantly surpasses traditional HTVS methods and recent Graph Neural Network (GNN) models in both recovery and efficiency due to a wider coverage of the chemical space and enhanced performance. Our results demonstrate that ScoreFormer achieves competitive performance in docking score prediction and offers a substantial 1.65-fold reduction in inference time compared to existing models. We evaluated ScoreFormer across multiple datasets under various conditions, confirming its robustness and reliability in identifying potential drug candidates rapidly.
Large-Scale Prediction of Disulphide Bond Connectivity
The formation of disulphide bridges among cysteines is an important feature of protein structures. Here we develop new methods for the prediction of disulphide bond connectivity. We first build a large curated data set of proteins containing disulphide bridges and then use 2-Dimensional Recursive Neural Networks to predict bonding probabilities between cysteine pairs. These probabilities in turn lead to a weighted graph matching problem that can be addressed efficiently. We show how the method consistently achieves better results than previous approaches on the same validation data.
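The step from pairwise bonding probabilities to a connectivity pattern can be illustrated with a small sketch. It assumes the pairwise probabilities are already available (here they are made-up numbers, not model output), that every cysteine is bonded, and that a pattern is scored by the sum of its pair probabilities. The abstract does not say which matching algorithm is used; the exhaustive search below is a stand-in that is only practical for a handful of cysteines, and the names `best_connectivity` and `prob` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Pair = Tuple[int, int]


def best_connectivity(
    cysteines: List[int], prob: Dict[FrozenSet[int], float]
) -> Tuple[float, List[Pair]]:
    """Return (score, pairs) for the highest-scoring perfect matching.

    Assumption: the score of a pattern is the sum of its pair probabilities.
    Exhaustive search; intended only for small numbers of cysteines.
    """
    if len(cysteines) % 2 != 0:
        raise ValueError("a perfect matching needs an even number of cysteines")
    if not cysteines:
        return 0.0, []

    first, rest = cysteines[0], cysteines[1:]
    best_score, best_pairs = float("-inf"), []
    # Pair the first cysteine with each candidate partner, then solve the rest.
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        score, pairs = best_connectivity(remaining, prob)
        score += prob[frozenset((first, partner))]
        if score > best_score:
            best_score, best_pairs = score, [(first, partner)] + pairs
    return best_score, best_pairs


# Hypothetical residue positions and made-up bonding probabilities.
positions = [11, 27, 45, 63]
prob = {
    frozenset((11, 27)): 0.90,
    frozenset((11, 45)): 0.20,
    frozenset((11, 63)): 0.10,
    frozenset((27, 45)): 0.30,
    frozenset((27, 63)): 0.15,
    frozenset((45, 63)): 0.80,
}

score, pairs = best_connectivity(positions, prob)
print(pairs)  # [(11, 27), (45, 63)]
```

With four cysteines there are only three possible patterns, so enumeration is trivial; the count grows very quickly with more cysteines, which is why an efficient weighted matching algorithm matters at scale.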
Non-Linear Label Ranking for Large-Scale Prediction of Long-Term User Interests
Djuric, Nemanja (Yahoo! Labs) | Grbovic, Mihajlo (Yahoo! Labs) | Radosavljevic, Vladan (Yahoo! Labs) | Bhamidipati, Narayan (Yahoo! Labs) | Vucetic, Slobodan (Temple University)
We consider the problem of personalization of online services from the viewpoint of ad targeting, where we seek to find the best ad categories to show to each user, resulting in improved user experience and increased advertiser revenue. We propose to address this problem as a task of ranking the ad categories according to a user's preferences, and introduce a novel label ranking approach capable of efficiently learning non-linear, highly accurate models in large-scale settings. Experiments on a real-world advertising data set with more than 3.2 million users show that the proposed algorithm outperforms existing solutions in terms of both rank loss and top-K retrieval performance, strongly suggesting the benefit of using the proposed model on large-scale ranking problems.